Deep learning via Hessian-free optimization
Author
Abstract
We develop a 2nd-order optimization method based on the “Hessian-free” approach, and apply it to training deep auto-encoders. Without using pre-training, we obtain results superior to those reported by Hinton & Salakhutdinov (2006) on the same tasks they considered. Our method is practical, easy to use, scales nicely to very large datasets, and isn’t limited in applicability to auto-encoders or any specific model class. We also discuss the issue of “pathological curvature” as a possible explanation for the difficulty of deep learning, and how 2nd-order optimization, and our method in particular, effectively deals with it.
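As a rough illustration of the Hessian-free idea described in the abstract (not the paper's actual implementation, which uses the Gauss-Newton curvature matrix computed with the R-operator together with a damping heuristic), the sketch below approximates curvature-vector products by a finite difference of gradients and finds the update direction with linear conjugate gradient. The grad callable and the constants eps, damping, and lr are illustrative assumptions.

import numpy as np

def hessian_vector_product(grad, theta, v, eps=1e-4):
    # Hessian-free: never form H explicitly. Approximate H v by a
    # finite difference of gradients: H v ≈ (g(theta + eps*v) - g(theta)) / eps
    return (grad(theta + eps * v) - grad(theta)) / eps

def conjugate_gradient(apply_A, b, max_iters=50, tol=1e-8):
    # Approximately solve A x = b with linear CG, using only
    # matrix-vector products apply_A(p).
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x (x starts at zero)
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iters):
        Ap = apply_A(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def hf_step(theta, grad, damping=1e-2, lr=1.0):
    # One Hessian-free update: solve (H + damping*I) d = -g with CG,
    # then move along the resulting direction d.
    g = grad(theta)
    apply_A = lambda v: hessian_vector_product(grad, theta, v) + damping * v
    direction = conjugate_gradient(apply_A, -g)
    return theta + lr * direction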
Similar resources
Improved Preconditioner for Hessian Free Optimization
We investigate the use of Hessian Free optimization for learning deep autoencoders. One of the critical components in that algorithm is the choice of the preconditioner. We argue in this paper that the Jacobi preconditioner leads to faster optimization and we show how it can be accurately and efficiently estimated using a randomized algorithm.
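One plausible way to estimate a Jacobi (diagonal) preconditioner with a randomized algorithm, offered here only as a hedged sketch rather than as that paper's exact procedure, is the probe-vector estimator diag(H) ≈ mean of v ⊙ (H v) over random ±1 vectors v. The hvp callable and the number of samples are assumptions.

import numpy as np

def estimate_diagonal(hvp, dim, num_samples=20, rng=None):
    # Randomized diagonal estimate: for Rademacher v, E[v ⊙ (H v)] = diag(H),
    # so averaging a few probes gives an approximate Jacobi preconditioner.
    rng = np.random.default_rng() if rng is None else rng
    diag_est = np.zeros(dim)
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=dim)   # Rademacher probe vector
        diag_est += v * hvp(v)                   # elementwise v ⊙ (H v)
    return diag_est / num_samples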
Investigations on hessian-free optimization for cross-entropy training of deep neural networks
Context-dependent deep neural network HMMs have been shown to achieve recognition accuracy superior to Gaussian mixture models in a number of recent works. Typically, neural networks are optimized with stochastic gradient descent. On large datasets, stochastic gradient descent improves quickly during the beginning of the optimization. But since it does not make use of second order information, ...
Saddle-free Hessian-free Optimization
Nonconvex optimization problems such as the ones in training deep neural networks suffer from a phenomenon called saddle point proliferation. This means that there are a vast number of high error saddle points present in the loss function. Second order methods have been tremendously successful and widely adopted in the convex optimization community, while their usefulness in deep learning remai...
Document for “On Optimization Methods for Deep Learning”
In our ICML paper titled “On optimization methods for deep learning”, we discussed the standard and sparse autoencoder models. However, due to space limitations in our paper, we were not able to present further details about the bases learned by the sparse autoencoder model, compare the standard autoencoder with the Hessian Free approach as described in (Martens, 2010) and analyze in detail the ...
Block-diagonal Hessian-free Optimization for Training Neural Networks
Second-order methods for neural network optimization have several advantages over methods based on first-order gradient descent, including better scaling to large mini-batch sizes and fewer updates needed for convergence. But they are rarely applied to deep learning in practice because of high computational cost and the need for model-dependent algorithmic variations. We introduce a variant of ...
Journal title:
Volume / Issue:
Pages: -
Publication year: 2010